Evaluating Branch Predictors on an SMT Processor

Authors

  • David Mulvihill
  • Matthew Allen
Abstract

Simultaneous multithreading (SMT) provides significant increases in microprocessor throughput by issuing instructions from multiple threads per clock cycle. SMT can be realized in a wide-issue superscalar with a modest increase in resources, because much of the hardware is shared among the multiple thread contexts. Branch prediction accuracy, a key component of microprocessor performance, can suffer greatly from destructive interference caused by multiple threads sharing the same branch prediction hardware. Although SMT processors are able to hide branch latencies by scheduling instructions from other threads, branch prediction accuracy is still important in SMT processors to reduce latencies of individual threads and to provide a diverse mix of threads to the instruction scheduler. This paper evaluates several well-known branch prediction schemes, and examines some modifications to address problems caused by sharing of branch prediction hardware in SMT.
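
The destructive interference mentioned in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a simple table of 2-bit saturating counters indexed by the branch address, and the names `BimodalPredictor`, `per_thread`, and `accuracy` are hypothetical. Two threads execute a branch that maps to the same table slot, one always taken and one never taken. The `per_thread` option, which adds the thread ID to the index, is just one possible way to separate the threads and is not necessarily one of the modifications the paper examines.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters (illustrative model, not the paper's)."""

    def __init__(self, entries, per_thread=False):
        self.entries = entries
        self.per_thread = per_thread  # assumption: thread ID becomes part of the index
        self.table = {}               # index -> counter in 0..3

    def _index(self, thread_id, pc):
        slot = pc % self.entries
        return (thread_id, slot) if self.per_thread else slot

    def predict_and_update(self, thread_id, pc, taken):
        key = self._index(thread_id, pc)
        counter = self.table.get(key, 1)   # start at "weakly not taken"
        prediction = counter >= 2          # 2 or 3 means "predict taken"
        # Saturating update toward the actual outcome.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
        self.table[key] = counter
        return prediction == taken


def accuracy(predictor, rounds=1000):
    """Interleave two threads whose branches alias to the same table slot."""
    correct = 0
    total = 0
    for _ in range(rounds):
        # Thread 0's branch is always taken; thread 1's is never taken.
        for thread_id, taken in ((0, True), (1, False)):
            correct += predictor.predict_and_update(thread_id, 0x40, taken)
            total += 1
    return correct / total


shared_acc = accuracy(BimodalPredictor(1024))
private_acc = accuracy(BimodalPredictor(1024, per_thread=True))
print(f"shared table:     {shared_acc:.4f}")
print(f"per-thread index: {private_acc:.4f}")
```

In this contrived worst case the shared counter is pushed back and forth by the two threads and never settles, while the per-thread index lets each branch train its own counter. Real workloads alias far less regularly, so the gap here overstates what one would expect in practice.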

Similar resources

CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors

As increasing the issue width yields diminishing returns on superscalar processors, thread parallelism on a single chip is becoming a reality. In the past few years, both SMT (Simultaneous MultiThreading) and CMP (Chip MultiProcessor) approaches were first investigated by academics and are now implemented by the industry. In some sense, CMP and SMT represent two extreme design points. In thi...

An Effective Bypass Mechanism to Enhance Branch Predictor for SMT Processors

Unlike traditional superscalar processors, a simultaneous multithreaded (SMT) processor can explore both instruction-level parallelism and thread-level parallelism at the same time. With the same fetch width, SMT does not fetch instructions from a single thread as deeply as a traditional superscalar processor does. Meanwhile, all the instructions from different threads share the same functional units in SMT. All...

Tolerating Branch Predictor Latency on SMT

Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...

A latency-conscious SMT branch prediction architecture

Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floating-point calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions cau...

Evaluation of dynamic branch predictors for modern ILP processors

Modern instruction-level parallel (ILP) processors use superscalar architectures with deep pipelines in order to execute multiple instructions per cycle. The frequency and behavior of branch instructions seriously hinder the performance of ILP processors. Various mechanisms, at both the compiler and the processor level, have been proposed to predict branch behavior. This work investigat...

Publication date: 2002